{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f04dd603-a2d1-48ce-8c17-9f1dba8de1ee",
   "metadata": {},
   "source": [
    "Chapter 14\n",
    "\n",
    "# 两个格拉姆矩阵的谱分解还原SVD\n",
    "《线性代数》 | 鸢尾花书：数学不难"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ccafb456-2453-4c82-8a65-b1963a370cb2",
   "metadata": {},
   "source": [
    "这段代码完整演示了如何手动实现一个实矩阵 $A$ 的奇异值分解（Singular Value Decomposition, 简称 SVD），并用分解结果重构原始矩阵。整个过程从数学角度可以分为几个核心步骤：\n",
    "\n",
    "---\n",
    "\n",
    "### $1.$ 初始化矩阵 $A$\n",
    "\n",
    "首先，定义一个 $3 \\times 2$ 的原始矩阵 $A$：\n",
    "\n",
    "$$\n",
    "A = \\begin{bmatrix}\n",
    "0 & 1 \\\\\n",
    "1 & 1 \\\\\n",
    "1 & 0\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "该矩阵是一个非方阵，因此不能直接进行特征值分解，但可以进行奇异值分解。\n",
    "\n",
    "---\n",
    "\n",
    "### $2.$ 计算 $A^\\top A$ 并进行特征值分解\n",
    "\n",
    "接下来，计算对称矩阵 $A^\\top A$：\n",
    "\n",
    "$$\n",
    "A^\\top A = \\begin{bmatrix}\n",
    "0 & 1 & 1 \\\\\n",
    "1 & 1 & 0\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "0 & 1 \\\\\n",
    "1 & 1 \\\\\n",
    "1 & 0\n",
    "\\end{bmatrix}\n",
    "= \\begin{bmatrix}\n",
    "2 & 1 \\\\\n",
    "1 & 2\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "然后对 $A^\\top A$ 进行特征值分解（使用 `np.linalg.eigh`），由于 $A^\\top A$ 是对称正半定矩阵，其特征值为实数，且对应的特征向量构成一个正交基。记这些特征值为 $\\lambda_1 \\geq \\lambda_2$，对应特征向量为 $v_1, v_2$，它们将组成矩阵 $V$：\n",
    "\n",
    "$$\n",
    "A^\\top A = V \\Lambda V^\\top\n",
    "$$\n",
    "\n",
    "其中 $\\Lambda = \\mathrm{diag}(\\lambda_1, \\lambda_2)$。\n",
    "\n",
    "---\n",
    "\n",
    "### $3.$ 计算奇异值 $\\sigma_i$\n",
    "\n",
    "奇异值 $\\sigma_i$ 是 $A^\\top A$ 的特征值平方根：\n",
    "\n",
    "$$\n",
    "\\sigma_i = \\sqrt{\\lambda_i}, \\quad i = 1, 2\n",
    "$$\n",
    "\n",
    "代码中通过 `np.sqrt(np.maximum(eigvals_V, 0))` 计算，确保在数值不稳定时不会对负数开根。\n",
    "\n",
    "---\n",
    "\n",
    "### $4.$ 构造矩阵 $S$\n",
    "\n",
    "根据奇异值构造对角矩阵 $S$，其尺寸与 $A$ 一致，即 $3 \\times 2$，主对角线填入奇异值（从大到小排列）：\n",
    "\n",
    "$$\n",
    "S = \\begin{bmatrix}\n",
    "\\sigma_1 & 0 \\\\\n",
    "0 & \\sigma_2 \\\\\n",
    "0 & 0\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "---\n",
    "\n",
    "### $5.$ 计算 $A A^\\top$ 并进行特征值分解\n",
    "\n",
    "类似地，计算另一个对称矩阵 $A A^\\top$：\n",
    "\n",
    "$$\n",
    "A A^\\top = \\begin{bmatrix}\n",
    "0 & 1 \\\\\n",
    "1 & 1 \\\\\n",
    "1 & 0\n",
    "\\end{bmatrix}\n",
    "\\begin{bmatrix}\n",
    "0 & 1 & 1 \\\\\n",
    "1 & 1 & 0\n",
    "\\end{bmatrix}\n",
    "= \\begin{bmatrix}\n",
    "1 & 1 & 0 \\\\\n",
    "1 & 2 & 1 \\\\\n",
    "0 & 1 & 1\n",
    "\\end{bmatrix}\n",
    "$$\n",
    "\n",
    "对 $A A^\\top$ 也进行特征值分解，得到特征向量矩阵 $U$，其列为 $u_1, u_2, u_3$：\n",
    "\n",
    "$$\n",
    "A A^\\top = U \\Lambda' U^\\top\n",
    "$$\n",
    "\n",
    "其中 $\\Lambda'$ 为 $A A^\\top$ 的特征值对角矩阵，且前两个非零特征值与 $A^\\top A$ 一致。\n",
    "\n",
    "---\n",
    "\n",
    "### $6.$ 重构原始矩阵 $A$\n",
    "\n",
    "根据 SVD 定理，任何实矩阵 $A$ 都可表示为：\n",
    "\n",
    "$$\n",
    "A = U S V^\\top\n",
    "$$\n",
    "\n",
    "此处：\n",
    "- $U$ 是 $A A^\\top$ 的特征向量矩阵（$3 \\times 3$）；\n",
    "- $V$ 是 $A^\\top A$ 的特征向量矩阵（$2 \\times 2$）；\n",
    "- $S$ 是构造的 $3 \\times 2$ 矩阵，主对角线上是奇异值。\n",
    "\n",
    "最后，矩阵重构如下：\n",
    "\n",
    "$$\n",
    "A_{\\text{reconstructed}} = U S V^\\top\n",
    "$$\n",
    "\n",
    "该结果应近似等于原始矩阵 $A$，允许浮点误差。\n",
    "\n",
    "---\n",
    "\n",
    "### 总结\n",
    "\n",
    "这段代码利用特征值分解间接构造了 SVD，实现了从：\n",
    "\n",
    "$$\n",
    "A \\longrightarrow U, S, V \\longrightarrow A_{\\text{reconstructed}}\n",
    "$$\n",
    "\n",
    "的完整流程，是理解奇异值分解几何结构的良好起点。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd92a62d-ba7c-4a9a-a48c-9c7611906310",
   "metadata": {},
   "source": [
    "## 初始化"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "93f67b5b-04ab-4396-bf85-52dac2594924",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bcd7eef3-4d5d-4585-8ed4-dc9182f4cdfc",
   "metadata": {},
   "source": [
    "## 原始矩阵"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "fdc980c0-f40c-4d1d-a9c5-7503484887ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "A = np.array([[0, 1],\n",
    "              [1, 1],\n",
    "              [1, 0]])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5cd7fd49-4746-4e6d-bd53-34140c5c112b",
   "metadata": {},
   "source": [
    "## 计算 A^T A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "7bf1d08f-e9d9-4d78-85b4-b94bcd4bea6c",
   "metadata": {},
   "outputs": [],
   "source": [
    "ATA = A.T @ A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "5b8437b9-b41c-4670-9bb0-d0b84ffb35a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "eigvals_V, V = np.linalg.eigh(ATA)  # eigh 返回升序的特征值和正交特征向量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "8e2bcb02-ad29-4f2a-a89d-ea4dd6fa24b8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1., 3.])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "eigvals_V"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "5f66f5db-90f4-4800-9542-a9c9bd6238ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 按特征值从大到小排序的索引\n",
    "idx = np.argsort(-eigvals_V)  # 注意是负号，表示降序排序\n",
    "\n",
    "# 排列特征值\n",
    "eigvals_V = eigvals_V[idx]\n",
    "\n",
    "# 按相应顺序排列特征向量的列\n",
    "V = V[:, idx]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "48993c7f-1696-4b33-bf68-71859126fd8f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.70710678, -0.70710678],\n",
       "       [ 0.70710678,  0.70710678]])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "V"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "dca14808-00de-4a6e-992d-e0ff61ff17a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "singular_vals = np.sqrt(np.maximum(eigvals_V, 0))  # 计算奇异值，防止负数取根"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "42244b4f-6245-435a-8d22-61cc9b7b615c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.73205081, 1.        ])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "singular_vals"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e88665a-0fcf-405e-805e-a818ce38e535",
   "metadata": {},
   "source": [
    "## 构造 S"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "0690f8ca-3dc1-4695-982b-c9b2507ea231",
   "metadata": {},
   "outputs": [],
   "source": [
    "S = np.zeros_like(A, dtype=float)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "4b3542ad-8ec7-4d5e-8d17-76f32eb9d794",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1.73205081, 0.        ],\n",
       "       [0.        , 1.        ],\n",
       "       [0.        , 0.        ]])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.fill_diagonal(S, singular_vals)  \n",
    "# 奇异值应按从大到小排列\n",
    "S"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5ba926d8-fd49-4ec8-ad42-233c1dbc1f5d",
   "metadata": {},
   "source": [
    "## 计算 A A^T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "13052aee-aa22-499d-8673-1bbde6985c7a",
   "metadata": {},
   "outputs": [],
   "source": [
    "AAT = A @ A.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "32e3c3da-02af-4433-b3db-40e549e9ac6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "eigvals_U, U = np.linalg.eigh(AAT)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "fae31988-95e7-4d99-977e-d59c60b8c3e5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([9.99658224e-17, 1.00000000e+00, 3.00000000e+00])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "eigvals_U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "aeb85ffa-1cff-4fe6-bce6-9ca04a80a842",
   "metadata": {},
   "outputs": [],
   "source": [
    "# 按特征值从大到小排序的索引\n",
    "idx = np.argsort(-eigvals_U)  # 注意是负号，表示降序排序\n",
    "\n",
    "# 排列特征值\n",
    "eigvals_U = eigvals_U[idx]\n",
    "\n",
    "# 按相应顺序排列特征向量的列\n",
    "U = U[:, idx]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "722a28eb-db7b-4fe7-a78a-c78e9c3a4070",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 1, 0], dtype=int64)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "idx"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "170a27d6-bc90-4d68-9018-c34a02414bbf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4.08248290e-01, -7.07106781e-01,  5.77350269e-01],\n",
       "       [ 8.16496581e-01, -7.58447699e-17, -5.77350269e-01],\n",
       "       [ 4.08248290e-01,  7.07106781e-01,  5.77350269e-01]])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81fdded9-fe05-45fe-bd83-92e37c04da1c",
   "metadata": {},
   "source": [
    "## 用 U @ S @ V.T 重构原始矩阵 A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "6e150e29-a34f-4c69-a260-804ce98036ea",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1., 0.],\n",
       "       [1., 1.],\n",
       "       [0., 1.]])"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A_reconstructed = U @ S @ V.T\n",
    "np.round(A_reconstructed,5)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7689df63-f38e-4767-bff7-adab600f35da",
   "metadata": {},
   "source": [
    "## 使用numpy.linalg.svd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "199086e5-0f13-47f5-a703-a44bddca3611",
   "metadata": {},
   "outputs": [],
   "source": [
    "U, S, V = np.linalg.svd(A, full_matrices = True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "96bad653-5767-48e5-932e-f83ce2528e24",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-4.08248290e-01,  7.07106781e-01,  5.77350269e-01],\n",
       "       [-8.16496581e-01,  5.55111512e-17, -5.77350269e-01],\n",
       "       [-4.08248290e-01, -7.07106781e-01,  5.77350269e-01]])"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "id": "6481bf4e-8bac-45ab-9b1c-f50b5a294d09",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.73205081, 1.        ])"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "2ad17295-34ba-4fb4-80d6-6bcbb90d6e0a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.70710678, -0.70710678],\n",
       "       [-0.70710678,  0.70710678]])"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "V"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "070c3389-8048-43a3-baa7-6666009bce96",
   "metadata": {},
   "source": [
    "作者\t**生姜DrGinger**  \n",
    "脚本\t**生姜DrGinger**  \n",
    "视频\t**崔崔CuiCui**  \n",
    "开源资源\t[**GitHub**](https://github.com/Visualize-ML)  \n",
    "平台\t[**油管**](https://www.youtube.com/@DrGinger_Jiang)\t\t\n",
    "\t\t[**iris小课堂**](https://space.bilibili.com/3546865719052873)\t\t\n",
    "\t\t[**生姜DrGinger**](https://space.bilibili.com/513194466)  "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:base] *",
   "language": "python",
   "name": "conda-base-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
